Speech emotion recognition via learning analogies

نویسندگان

چکیده

This work introduces the few-shot learning paradigm in speech emotion recognition domain. Emotional characterization of segments is carried out through analogies, i.e. by assessing similarities and dissimilarities between novel known recordings. More specifically, we designed a Siamese Neural Network modeling such relationships on combined log-Mel temporal modulation spectrogram space. We present thorough experimentations performance proposed solution holistically, where it demonstrated that reaches state art rates when following standard leave-one-speaker-out protocol, while at same time being able to operate non-stationary conditions, with limited knowledge speakers and/or emotional classes. Finally, investigated activation maps layer-wise manner order interpret predictions made model.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Emotion Recognition via Continuous Mandarin Speech

Emotion plays a significant role in cognitive psychology, behavioural sciences and humanoid robot design. The continuing improvements in speech recognition technology have led to many new and fascinating applications in human-computer interaction, context aware computing and computer mediated communication. A growing number of research studies in emotion recognition via an isolated short senten...

متن کامل

Active learning for dimensional speech emotion recognition

State-of-the-art dimensional speech emotion recognition systems are trained using continuously labelled instances. The data labelling process is labour intensive and time-consuming. In this paper, we propose to apply active learning to reduce according efforts: The unlabelled instances are evaluated automatically, and only the most informative ones are intelligently picked by an informativeness...

متن کامل

Feature Transfer Learning for Speech Emotion Recognition

Speech Emotion Recognition (SER) has achieved some substantial progress in the past few decades since the dawn of emotion and speech research. In many aspects, various research efforts have been made in an attempt to achieve human-like emotion recognition performance in real-life settings. However, with the availability of speech data obtained from different devices and varied acquisition condi...

متن کامل

Representation Learning for Speech Emotion Recognition

Speech emotion recognition is an important problem with applications as varied as human-computer interfaces and affective computing. Previous approaches to emotion recognition have mostly focused on extraction of carefully engineered features and have trained simple classifiers for the emotion task. There has been limited effort at representation learning for affect recognition, where features ...

متن کامل

Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Pattern Recognition Letters

سال: 2021

ISSN: ['1872-7344', '0167-8655']

DOI: https://doi.org/10.1016/j.patrec.2021.01.018